1 Statement of the Problem Studied


The equipment acquired under this DURIP grant is intended for research in statistical automated
target recognition (ATR) using remotely sensed images. We have focused on the problem of recognizing
(known) targets from their infrared and video images. In addition to the variability associated with
the targets, their pose, motion, and thermal profiles, the targets are assumed to be present in
cluttered environments. Taking a statistical approach, our main goal was to derive efficient
probability models (analytical wherever possible) for these sources of variability in the observed
images. Having obtained such probability models, we have derived algorithms for inference and
quantified the algorithmic performance for specific ATR situations. The key idea behind this research
is to isolate and model the different physical variables individually, and to derive
models/estimators for each one of them. These results will be used in developing the theory of a
general-purpose ATR algorithm, which continues to be our research focus.

In  particular,  our  research  focused  on  the  following  aspects  of  ATR: 

1. FLIR ATR: Infrared images exhibit large variability due to the varying thermal states of
the targets. Modeling the thermal states as scalar temperature fields on the target surfaces,
we have developed a regression framework for predicting arbitrary FLIR images of known
targets in partially observed but otherwise unknown thermal states. This tool can be used
for constructing/refining an image database or directly for FLIR ATR.

2. Clutter Modeling and Classification: Modeling the clutter pixels remains the most challenging
component of statistical ATR. Targets of interest seldom occur alone in a scene, and the presence
of other objects leads to confusing clutter. We have derived coarse, tractable probability models
for these clutter pixels and have applied them to classifying the clutter type in natural images.

3. Bayesian Inferences for ATR: Once we have the probability models, the next task is
to derive algorithms for inference under these models. Since the ATR representations take
values on manifolds, we need a theory of inference on such parameter spaces.
We have built upon the previous work [15] to develop algorithms for estimation/tracking
of manifold-valued parameters for ATR. Specifically, we have derived a nonlinear filter for
tracking stochastic processes on (finite-dimensional) Lie groups and their quotient spaces.

4. Performance Analysis of Bayesian ATR: Any inference procedure should be accompanied
by its performance specifications, and we have derived metrics for analyzing performance
for the following ATR tasks: (i) estimation of nuisance parameters such as pose, location,
thermal state, etc., and (ii) selection of the maximum a posteriori hypothesis (target type) in the
presence of estimated nuisance parameters. Using asymptotic arguments, we have related the
performance in target recognition to the performance in pose estimation in an analytical form
[3, 5].

2  Summary  of  Scientific  Progress  and  Accomplishments 

Next we describe the results obtained under these items.

2.1  Prediction  of  IR  Images 

A recent ARO strategy meeting shortlisted a number of high-priority research areas.
In the area of knowledge base acquisition and refinement, it was stated that "Some means
of (database) construction and refinement on the fly .. is required. Generation of relevant multiple
viewpoints for retraining of ATR functionality ... is desired".

To pursue database updating and "rapid retraining", we have developed a framework for
predicting IR images of a known target in a new thermal state. We assume a prior database (mostly
incomplete in terms of the target's thermal states) of target profiles and some partial observation
of a new thermal state. The goal is to use the new observation, along with the prior database,
to generate estimates of IR images from all other angles, in order to update the database. In our
approach, the thermal states of the target are represented via scalar temperature fields, and the
prediction task becomes that of estimating the unobserved parts of the field using the observed
parts and the past patterns. This estimation is performed using regression models that relate the
temperature variables, at different points on the target's surface, across different thermal states. A
linear regression model is applied, and experiments have been conducted using a laboratory target
and a hand-held IR camera. Shown in the upper-middle panel of Figure 1 is the target used in
preliminary experiments; the upper-left panel shows its CAD model and the upper-right panel shows
an example IR image. We have modeled IR images as Gaussian random fields: the mean field is
given by the projection of the 3D target temperature field onto the 2D image space. The histograms
in the bottom panels of Figure 1 display the pixel variations in two homogeneous regions of an IR
image (bottom-left panel) and hence support the choice of a Gaussian model for the sensor noise.

Figure 1: Top panels: CAD model (left), a video picture (middle), and an IR image (right) of the
target used in experiments. Bottom panels: an IR image (left) and histograms (middle, right) of
pixels in two homogeneous regions of that image.

Instead of storing past IR images, we propose organizing the past database in the form of scalar
temperature fields, each associated with a distinct thermal state. Hence, texture mapping of IR
images into (scalar) temperature fields becomes important. Observed IR images of a target in a
fixed thermal state, taken from multiple perspectives, are mapped using commercial software onto the
polygonal representation of its surface. Given this texture mapping, one can synthesize an IR image
of this target in this thermal state from an arbitrary perspective. Shown in the bottom panels
of Figure 2 are some example images synthesized for a thermal state captured by six IR images (two
of them are shown in the top panels). Using this procedure, we can generate a temperature field
(via a texture map) for any previously observed thermal state of the target. Repeating this process
for a number of thermal states, we obtain temperature fields for a number of previously observed
thermal states. A principal component analysis of these fields results in a compact prior database
of the thermal states, for use in predicting IR images associated with future thermal states.

Now consider the problem of estimating IR images, from multiple viewpoints, of a target in a
partially observed thermal state. To set up the prediction experiment, the target was imaged in a
new thermal state. In order to study the algorithmic performance, we have added white Gaussian
noise (with a standard deviation of $\sigma$) to this image, in addition to the already existing
sensor noise. We have simulated partial observations by selecting a sub-image from the original
image and then using it in our regression algorithm to compute the remaining thermal field. Shown
in the bottom-left panel of Figure 3 is an image of the target in the true underlying thermal state,
and the same image with added white Gaussian noise is shown in the second bottom panel. The selected
sub-image is shown in the third bottom panel and the estimated image is shown in the bottom-right
panel. To analyze estimation performance, we compute the matrix 2-norm of the difference between the
estimated and the original image (alternative performance metrics can be substituted). Let $p$ denote
the fraction of the pixels selected in the sub-image, compared to the original image. Shown in Figure
4 is the plot of expected relative error versus $\sigma$, for $p$ = 1.12%, 2.93%, 9.21%, and 15.29%.
As expected, the relative error decreases as $p$ increases and increases with $\sigma$. If the
estimated temperature field is found to be significantly different from the current database, it can
be included in the database for database updating. This idea can also be used for database
refinement, where a subset of the prior database is selected according to the current observations.

Figure 2: Texture mapping of IR images on the target surface to estimate the temperature field.
Top panels: original IR images, bottom panels: synthesized IR images using texture maps.

Figure 3: Upper panels: IR images of the target at prior thermal states, denoting the prior database.
Bottom panels: Estimation of an IR image using a noisy sub-image of the original.
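The prediction step described above can be illustrated with a minimal sketch, under stated assumptions: the prior database is reduced to a mean field plus a few principal directions, and the unobserved part of a new field is filled in by a least-squares fit to the observed points. The function names, the synthetic one-dimensional "temperature fields", and the use of a vector 2-norm for the relative error are all hypothetical choices made for illustration; they are not taken from the experiments reported here.

```python
import numpy as np


def fit_prior(fields, n_components):
    """Mean field and leading principal directions of prior temperature fields.

    fields: array of shape (n_states, n_points), one row per thermal state.
    """
    mean = fields.mean(axis=0)
    # Rows of vt are the principal directions of the centered fields.
    _, _, vt = np.linalg.svd(fields - mean, full_matrices=False)
    return mean, vt[:n_components]


def predict_field(mean, basis, obs_idx, obs_values):
    """Least-squares estimate of the full field from a partial observation."""
    design = basis[:, obs_idx].T  # shape (n_observed, n_components)
    coeffs, *_ = np.linalg.lstsq(design, obs_values - mean[obs_idx], rcond=None)
    return mean + coeffs @ basis


def relative_error(estimate, truth):
    """Vector 2-norm of the error, relative to the norm of the true field."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)


# Synthetic stand-in for a database of temperature fields (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 400)
modes = np.stack([np.sin((j + 1) * np.pi * x) for j in range(3)])
base = 20.0 + 5.0 * x
prior_fields = base + rng.standard_normal((20, 3)) @ modes
new_field = base + rng.standard_normal(3) @ modes

mean, basis = fit_prior(prior_fields, n_components=3)
obs_idx = rng.choice(x.size, size=40, replace=False)  # 10% of the points observed


def run(sigma):
    """Relative error of the prediction when the observation has noise level sigma."""
    noisy = new_field[obs_idx] + sigma * rng.standard_normal(obs_idx.size)
    return relative_error(predict_field(mean, basis, obs_idx, noisy), new_field)


err_clean, err_noisy = run(0.0), run(0.5)
```

In this toy setting the new field lies in the span of the prior database, so the noise-free prediction is essentially exact and the error grows once observation noise is added.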

For  details  regarding  this  approach,  please  refer  to  the  article  [14].  This  research  is  being 
performed  in  collaboration  with  Dr.  Richard  Sims  of  AMCOM  and  a  graduate  student,  Brian 
Thomasson,  of  FSU. 

Figure 4: Variation of estimation error in IR image prediction versus the additive noise standard
deviation in the observed image, for four different values of p.


2.2  Clutter  Modeling  and  Classification  Metrics 

Given an observed image of a target, imaged in a cluttered environment, one would like to
characterize the clutter to the extent that it improves the ATR performance. Some knowledge of the
clutter type, whether it is grass, buildings, trees, or roads, can help improve target recognition.
We have derived coarse analytical models for representing image spectra and imposed an $L^2$ metric
on them to quantify image differences. An emerging approach to representing and analyzing images
is to decompose them into their spectral bins, i.e. to perform band-pass filtering of the images into
different frequency bands, and then to study the statistics of these components. Studies (e.g. [1])
have shown that the human visual system also decomposes images into such frequency components.
For implementation, these components are computed using linear filters, each tuned to a different
frequency, scale, and orientation; a formal framework for such spectral analysis was introduced by
Gabor [2]. The marginal densities (histograms) of the components have often been chosen as the
sufficient statistics, and have been successfully applied to modeling, analysis, and even synthesis
of homogeneous textures [7, 16, 6]. Let $F^{(j)}$, $j = 1, 2, \ldots, K$, be the linear filters that
are used to decompose an image $I$ into its spectral components. In this work, we have used Gabor
filters, although other such filters can also be used. We require that the filters be chosen such that
the spectral components have marginal densities that are: (i) unimodal with a mode at zero, and
(ii) symmetric around zero. Then, $I^{(j)} = I * F^{(j)}$ is a spectral component of the image $I$,
where $*$ denotes the 2D convolution operation. We are interested in an analytical form that models
the probability density of the pixels in $I^{(j)}$. Mathematically, an image pixel is modeled as

$$ I(z) = \sum_i a_i g_i(z - z_i), \quad z = [x \ y]^T, \ z_i = [x_i \ y_i]^T. \qquad (1) $$

Here $z$ is the variable for pixel location, $g_i$ is a profile of a randomly chosen object, and the
$a_i$'s are random weights associated with the different profiles. The $a_i$'s are modeled as i.i.d.
standard normal and the locations $z_i$'s are modeled as samples from a 2D Poisson process with a
uniform intensity $\lambda$ (independent of the $a_i$'s). Consider a pixel in the component $I^{(j)}$:

$$ I^{(j)}(z) = \sum_i a_i g_i^{(j)}(z - z_i), \quad \text{where } g_i^{(j)} = F^{(j)} * g_i. \qquad (2) $$

$I^{(j)}(z)$ is a random variable and we want to characterize its randomness. The conditional density
of $I^{(j)}(z)$, given the Poisson points $\{z_i\}$ and the profiles $\{g_i\}$, is $N(0, u)$, where the
random variable $u$ is defined as the quantity $\sum_i (g_i^{(j)}(z - z_i))^2$.

For general cases, with completely unknown objects in the image, a broad family of distributions,
not relying on prior knowledge of the $g_i$'s, is required. $u$ has some distribution on
the positive real line and, motivated by empirical studies, we model it by a scaled $\Gamma$-density:
$f_u(u) = \frac{1}{c\,\Gamma(p)} (u/c)^{p-1} \exp(-u/c)$; $p, c \in \mathbb{R}_+$, where $p$ is called
the shape parameter and $c$ is called the scale parameter of $u$. Now we can integrate over the
Poisson and profile variation, and derive the marginal density of $I^{(j)}(z)$. This density is given by:

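Theorem 1 Under the proposed model, the density function of $I^{(j)}(z)$ is: for $p > 0$ and $c > 0$,

$$ f(t; p, c) = \frac{1}{\sqrt{\pi}\,\Gamma(p)\,(2c)^{p/2 + 1/4}} \, |t|^{p - 1/2} \, K_{p - 1/2}\!\left(\sqrt{2/c}\,|t|\right), \qquad (3) $$

where $K$ is the modified Bessel function.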

We call this structure of $f$ a Bessel form. Let $V$ denote the set of all such Bessel forms:
$V = \{f(\cdot; p, c) \mid c, p \in \mathbb{R}_+\}$. As stated in [4], the shape parameter $p$ provides
some idea about the nature of the component $I^{(j)}$ (and hence about $I$). For $p = 1$, $f(t; 1, c)$
is the density of a double exponential, or the exponential model. In general, $f(t; p, c)$ is the $p$th
convolution power of the double exponential density. If $p > 1$, we call it the super-exponential
model, and we get closer to the Gaussian, especially if $p \gg 1$. On the other hand, if $p < 1$, we
call it the sub-exponential model, and the cusp of the density at zero becomes more pronounced.

How do we estimate a Bessel form for a given spectral component $I^{(j)}$? Since the probability
density $f$ takes a parametric form, with parameters $p$ and $c$, this task reduces to that of
estimating $p$ and $c$ under an appropriate criterion. We have utilized a maximum-likelihood
estimation (MLE) procedure to estimate $p$ and $c$, according to:

$$ \hat{p} = \frac{3}{\mathrm{kurtosis}(I^{(j)}) - 3}, \qquad \hat{c} = \frac{\mathrm{variance}(I^{(j)})}{\hat{p}}, \qquad (4) $$

where variance and kurtosis are the sample variance and the sample kurtosis of the elements of
$I^{(j)}$, respectively. We illustrate some estimation results for natural images.

• Shown in the top panels of Figure 5 are some real images taken from the Groningen database.
The middle panels display their specific filtered forms (or the spectral components) for Gabor
filters chosen at arbitrary orientations, and the bottom panels plot the marginal densities. On
a log scale, the observed densities (histograms) are plotted in broken lines and the estimated
Bessel forms ($f(x; p, c)$) are plotted in solid lines.

• Shown in Figure 6 is another example. For the image shown in the top panel we have
computed the observed and the estimated marginals for a number of Gabor filters. The
middle panels plot the marginals for different filter orientations ($\theta$ = 30, 60, 90, 120, and 150
degrees) while keeping the scale fixed at $\sigma$ = 4.0, and the bottom panels are for different filter
scales ($\sigma$ = 1, 2, 3, 4, and 5) keeping the orientation fixed at $\theta$ = 150.

In  our  experiments,  we  have  found  a  remarkable  fit  between  the  observed  and  the  estimated 
marginals,  for  a  large  set  of  filtered  natural  images. 
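
The estimator in Eqn. (4) can be checked on simulated data with a short sketch. Under the model above, a pixel is Gaussian with a Gamma-distributed variance $u$, which gives a variance of $pc$ and a kurtosis of $3 + 3/p$; inverting these two relations yields the expressions in Eqn. (4). The function names below are hypothetical, and the data are drawn from the model itself rather than from filtered natural images.

```python
import numpy as np


def sample_component(p, c, size, rng):
    """Draw pixels from the model: u ~ Gamma(shape=p, scale=c), t | u ~ N(0, u)."""
    u = rng.gamma(shape=p, scale=c, size=size)
    return np.sqrt(u) * rng.standard_normal(size)


def estimate_bessel_params(values):
    """Estimate (p, c) from the sample variance and sample kurtosis, as in Eqn. (4)."""
    values = np.asarray(values, dtype=float).ravel()
    centered = values - values.mean()
    variance = np.mean(centered ** 2)
    kurtosis = np.mean(centered ** 4) / variance ** 2
    if kurtosis <= 3.0:
        # The formula needs heavier-than-Gaussian tails to give a positive p.
        raise ValueError("sample kurtosis must exceed 3 for a Bessel form")
    p_hat = 3.0 / (kurtosis - 3.0)
    return p_hat, variance / p_hat


rng = np.random.default_rng(0)
pixels = sample_component(p=2.0, c=1.5, size=400_000, rng=rng)
p_hat, c_hat = estimate_bessel_params(pixels)
```

With a large sample the estimates land near the generating values; for real spectral components, `values` would be the pixels of one filtered image $I^{(j)}$.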

To quantify the distance between two Bessel forms, we have chosen the $L^2$-metric on $V$. It is
possible that other metrics, such as the Kullback-Leibler divergence or the $L^1$ metric, may be more
appropriate. $L^2$ is a common choice for search/optimization problems and also leads to a relatively
simple expression. The main drawback of this choice is that the Bessel forms are not in $L^2$ for
$p < 0.25$, and therefore the metric is not applicable to those cases.

Figure 5: Images (top panels), their spectral components (middle panels), and the marginal
densities (bottom panels). The observed densities are drawn in broken lines and the estimated Bessel
forms are drawn in solid lines.

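Theorem 2 The $L^2$-distance between the two Bessel densities, parameterized by $(p_1, c_1)$ and
$(p_2, c_2)$, respectively, is given by: for $p_1, p_2 > 0.25$, $c_1, c_2 > 0$,

$$ d(p_1, c_1, p_2, c_2) = \sqrt{ \frac{\Gamma(0.5)}{2\sqrt{2\pi}} \left( \frac{G(2p_1)}{\sqrt{c_1}} + \frac{G(2p_2)}{\sqrt{c_2}} - \frac{2\,G(p_1 + p_2)}{\sqrt{c_1}} \left( \frac{c_1}{c_2} \right)^{p_2} \mathcal{F} \right) }, \qquad (5) $$

where $G(p) = \frac{\Gamma(p - 0.5)}{\Gamma(p)}$ and $\mathcal{F} = F\left((p_1 + p_2 - 0.5), p_2; p_1 + p_2; 1 - \frac{c_1}{c_2}\right)$
($F$ is the hypergeometric function).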

Theorem 2 provides a metric between two Bessel forms, or between two spectral marginals. It
can be extended to a metric on the image space as follows. For any two images, $I_1$ and $I_2$, and the
filters $F^{(1)}, \ldots, F^{(K)}$, let the parameter values be given by $(p_1^{(j)}, c_1^{(j)})$ and
$(p_2^{(j)}, c_2^{(j)})$, respectively, for $j = 1, 2, \ldots, K$. Then, the $L^2$-distance between the
spectral representations of the two images is defined as:

$$ d_I(I_1, I_2) = \sqrt{ \sum_{j=1}^{K} d(p_1^{(j)}, c_1^{(j)}, p_2^{(j)}, c_2^{(j)})^2 }. \qquad (6) $$
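
As a cross-check on the distance between two Bessel forms, the $L^2$ inner products can also be evaluated numerically. The sketch below does not use the closed form of Theorem 2. It relies instead on two facts that follow from the model: a Gaussian with Gamma(p, c)-distributed variance has characteristic function $(1 + c\omega^2/2)^{-p}$, and by Parseval's identity the inner product of two densities is $\frac{1}{2\pi}$ times the integral of the product of their characteristic functions. The function names are hypothetical, and `scipy.integrate.quad` is an implementation choice made here.

```python
import numpy as np
from scipy.integrate import quad


def inner_product(p1, c1, p2, c2):
    """L2 inner product of two Bessel forms via their characteristic functions."""
    def integrand(w):
        return (1.0 + 0.5 * c1 * w * w) ** (-p1) * (1.0 + 0.5 * c2 * w * w) ** (-p2)

    # The integrand is even, so (1/2pi) * integral over R = (1/pi) * integral over [0, inf).
    value, _ = quad(integrand, 0.0, np.inf)
    return value / np.pi


def bessel_l2_distance(p1, c1, p2, c2):
    """Numerical L2 distance between Bessel forms with parameters (p1, c1) and (p2, c2)."""
    if min(p1, p2) <= 0.25:
        raise ValueError("Bessel forms are square-integrable only for p > 0.25")
    squared = (inner_product(p1, c1, p1, c1)
               + inner_product(p2, c2, p2, c2)
               - 2.0 * inner_product(p1, c1, p2, c2))
    return np.sqrt(max(squared, 0.0))  # guard against tiny negative round-off


def image_distance(params1, params2):
    """Root-sum-of-squares of per-filter distances between two lists of (p, c) pairs."""
    total = sum(bessel_l2_distance(p1, c1, p2, c2) ** 2
                for (p1, c1), (p2, c2) in zip(params1, params2))
    return np.sqrt(total)
```

For example, `image_distance([(1.0, 2.0), (2.0, 1.0)], [(1.5, 2.0), (2.0, 0.5)])` combines the per-filter distances of two images described by two filters each. For $p = 1$ the squared norm reduces to $1/(2\sqrt{2c})$, which gives a quick sanity check of the integration.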

Consider the images of natural clutter shown in Figure 7. For a simple illustration, let the images
in the top row be the training images that are already classified, and the bottom row be the images
that are to be classified. Using nine small-scale Gabor filters ($K = 9$), for nine different
orientations at a fixed scale, we have computed the pairwise distances $d_I$. These distances are
shown in the table below:

        I1    I2    I3    I4    I5    I6    I7    I8    I9    I10
I1     0.00  0.51  0.61  0.63  0.59  0.64  0.85  0.92  0.74  0.70
I2     0.51  0.00  0.59  0.61  0.60  0.65  0.86  0.93  0.71  0.67
I3     0.61  0.59  0.00  0.46  0.69  0.72  0.89  0.96  0.67  0.62
I4     0.63  0.61  0.46  0.00  0.71  0.74  0.90  0.97  0.63  0.57
I5     0.59  0.60  0.69  0.71  0.00  0.48  0.81  0.89  0.78  0.76
I6     0.64  0.65  0.72  0.74  0.48  0.00  0.78  0.87  0.81  0.78
I7     0.85  0.86  0.89  0.90  0.81  0.78  0.00  0.76  0.94  0.92
I8     0.92  0.93  0.96  0.97  0.89  0.87  0.76  0.00  0.99  0.98
I9     0.74  0.71  0.67  0.63  0.78  0.81  0.94  0.99  0.00  0.48
I10    0.70  0.67  0.62  0.57  0.76  0.78  0.92  0.98  0.48  0.00

Figure 6: Plots of observed and estimated marginals (on a log scale) of the spectral components of
a given image (top panel). Middle panels depict the marginals for different filter orientations: 30,
60, 90, 120, and 150, while the bottom panels are for different filter scales: 1, 2, 3, 4, and 5.

Using the nearest neighbor approach, and the metric $d_I$ listed in the table, we can correctly
associate the test images with the corresponding training images. With a careful choice of filters,
one can view $d_I$ as a perception metric, i.e. a metric that seems to match well with our perception.
To illustrate the classification of clutter types, we have plotted a clustering chart in the left panel
of Figure 8 using the dendrogram function in MATLAB. This function generates a clustering tree
for points in image space when their pairwise distances are given. The clustering of $I_1$ with $I_2$,
$I_3$ with $I_4$, and so on, demonstrates the success of this representation and the metric chosen. For a
quick comparison, a dendrogram clustering using the Euclidean distances on the image space (i.e.
$\|I_1 - I_2\|^2$, where $\|\cdot\|$ is the Frobenius norm) is shown in the right panel. Clearly, the
Euclidean metric does not provide a satisfactory clustering. These results show that, through Gabor
filtering, the Bessel forms retain enough information to associate similar objects and to classify
clutter type in cluttered ATR.
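
The nearest-neighbor association can be reproduced directly from the pairwise distances reported in the table. The sketch below uses SciPy rather than the MATLAB dendrogram function mentioned above, and the average-linkage setting is an assumption, since the linkage used for Figure 8 is not stated. The variable and function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

labels = [f"I{k}" for k in range(1, 11)]

# Pairwise distances d_I between the ten images, copied from the table.
D = np.array([
    [0.00, 0.51, 0.61, 0.63, 0.59, 0.64, 0.85, 0.92, 0.74, 0.70],
    [0.51, 0.00, 0.59, 0.61, 0.60, 0.65, 0.86, 0.93, 0.71, 0.67],
    [0.61, 0.59, 0.00, 0.46, 0.69, 0.72, 0.89, 0.96, 0.67, 0.62],
    [0.63, 0.61, 0.46, 0.00, 0.71, 0.74, 0.90, 0.97, 0.63, 0.57],
    [0.59, 0.60, 0.69, 0.71, 0.00, 0.48, 0.81, 0.89, 0.78, 0.76],
    [0.64, 0.65, 0.72, 0.74, 0.48, 0.00, 0.78, 0.87, 0.81, 0.78],
    [0.85, 0.86, 0.89, 0.90, 0.81, 0.78, 0.00, 0.76, 0.94, 0.92],
    [0.92, 0.93, 0.96, 0.97, 0.89, 0.87, 0.76, 0.00, 0.99, 0.98],
    [0.74, 0.71, 0.67, 0.63, 0.78, 0.81, 0.94, 0.99, 0.00, 0.48],
    [0.70, 0.67, 0.62, 0.57, 0.76, 0.78, 0.92, 0.98, 0.48, 0.00],
])


def nearest_neighbors(dist):
    """Index of the closest other image for each image (self-distances excluded)."""
    masked = dist + np.diag(np.full(len(dist), np.inf))
    return masked.argmin(axis=1)


pairs = {labels[i]: labels[j] for i, j in enumerate(nearest_neighbors(D))}

# Clustering tree from the same distances; squareform converts the symmetric
# matrix to the condensed vector that linkage expects.
tree = linkage(squareform(D), method="average")
```

Each image's nearest neighbor under $d_I$ is its partner in the pairs $(I_1, I_2)$, $(I_3, I_4)$, and so on. The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a chart of the kind shown in Figure 8.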

For  details  please  refer  to  the  articles  [4,  12].  This  research  is  being  performed  in  collaboration 
with  Prof.  Xiuwen  Liu  of  FSU  and  Prof.  Ulf  Grenander  of  Brown  University. 


Figure  7:  Ten  natural  images  from  the  Groningen  database:  top  row  are  the  training  images  and 
bottom  row  are  the  test  images. 


Figure 8: Dendrogram classification of images shown in Figure 7. Left panel: using image metric
$d_I$, right panel: using Euclidean metric on image pixels.


2.3  Bayesian  Filtering  for  Estimation/Tracking  on  Manifolds 

In ATR and many other signal/image processing applications, the parameters of interest are often
constrained to take values on manifolds. In this research, we have addressed the problem of
tracking manifold-valued parameters using a nonlinear, non-Euclidean filtering approach. To
establish a filtering framework, the system evolution is represented by trajectories on a manifold and
a dynamics-based state equation is imposed on that space. This prior dynamic model combined with
a likelihood function forms a time-varying posterior density on the manifold, to allow for Bayesian
filtering and estimation. Using a sequential Monte Carlo method, or particle filtering, a recursive
procedure is derived for propagating an estimate of this posterior (through random samples) in
time. Posterior samples are then utilized to estimate the unknown parameters.

Let $S$ be the manifold on which the parameter of interest lies: for rigid target tracking $S$ is
the Euclidean group of rigid motions, and for principal component tracking $S$ is the Grassmann
manifold. For discrete observation times $t = 1, 2, \ldots$, the system trajectory is given by the
sequence $s_1, s_2, \ldots \in S$, and let the observation sequence be given by $Y_1, Y_2, \ldots$.
Given the observation sequence $Y_{1:t} = \{Y_1, \ldots, Y_t\}$, the goal is to estimate the sequence
$s_{1:t} = \{s_1, \ldots, s_t\} \in S^t$ using a minimum mean-squared error (MMSE) criterion. The
nonlinear filtering equations are given by, for $t = 2, 3, \ldots$,

$$ f(s_t | Y_{1:t-1}) = \int_S f(s_t | s_{t-1}) f(s_{t-1} | Y_{1:t-1}) \gamma(ds_{t-1}), \qquad (7) $$

$$ f(s_t | Y_{1:t}) = \frac{f(Y_t | s_t) f(s_t | Y_{1:t-1})}{f(Y_t | Y_{1:t-1})}. \qquad (8) $$

The filtering problem was studied for the following two applications:

1. Face Tracking on Euclidean Group: As an example of rigid tracking, we are interested in
tracking human faces from a video sequence. Using a deformable template approach, the
target motion is tracked by tracking the rotations and translations (SE(3), the special Euclidean
group) that best match the synthesized images to the observed images. Shown in Figure 9 is
an example from our face tracking software. One frame of the observed video sequence is shown
in the bottom-right panel, and the 3D template of the face is rendered in the top-left panel. This
template was generated using a Minolta vivid700 3D scanner, and the scanned polygonal surface is
shown in the top-right panel. The likelihood function used in the tracking is computed from
the norm of the difference image between the observed and the hypothesized images (an example is
shown in the bottom-left panel).


Figure 9: Illustration of face tracking: bottom-right is the video sequence for tracking, top-left is the
rendering of our 3D face template, top-right is the polygonal surface of the template, and bottom-left
is the difference image that provides a cost function for tracking.


This  research  is  being  performed  in  collaboration  with  Prof.  Gordon  Erlebacher  of  FSU  and 
a  graduate  student  Curt  Hesher. 

2. Principal Component Tracking on Grassmann Manifold: Consider the problem of
principal subspace tracking in array signal processing, using a narrowband, uniform linear
array (ULA) consisting of $n$ elements at half-wavelength spacing. Furthermore, assume
that there are $m$ signal transmitters, and that the sensor output is modeled according to the
classical narrowband signal model [13]. The novel parts of this research are: (i) posing the subspace
tracking problem as that of inference about trajectories on a complex Grassmann manifold;
(ii) establishing the notion of geodesics and motion variables on Grassmannians, in order to
impose dynamical models; this framework allows for learning the dynamical model from
past trajectories and using it for tracking future ones; (iii) using a dynamic model, along
with an observation model, to treat subspace tracking as a problem in nonlinear filtering; and
(iv) applying a sequential Monte Carlo algorithm to approximate the time-varying
posterior density on the Grassmannian. In addition to estimating MMSE subspaces, this
sampling also allows for the estimation of expected errors and other posterior moments, for
performance diagnostics. Together, these contributions lead to a fundamental and widely
applicable algorithm for subspace tracking or, more generally, for tracking on quotient spaces
of finite-dimensional Lie groups.

Figure 10 displays the tracking results for two datasets. Each plot shows the estimation error
$\|\hat{s}_t - s_t\|$ for three different estimation procedures. First, the error associated with the
instantaneous maximum-likelihood estimate (MLE), obtained by SVD of the instantaneous
covariance matrix, is shown in the broken line. The error resulting from an adaptive procedure,
relying on the SVD of a covariance matrix (using data over a sliding window), is shown
in the dotted line. Finally, the estimation error for tracking with our method is plotted in
bold.

Figure 10: These panels (each titled "Estimation Performance") plot the error in subspace tracking
($\|\hat{P}_t - P_t\|$) as a function of $t$ for: (i) MLE (broken line), (ii) adaptive tracking (dotted
line), and (iii) Bayesian tracking (solid line).
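
The sliding-window baseline and the projector error used in these plots can be sketched as follows. This is an illustration under stated assumptions, not the code used for Figure 10: the steering-vector sign convention, the arrival angles, the noise level, and all function names are hypothetical, and the subspace error is measured as the Frobenius norm of the difference between orthogonal projectors.

```python
import numpy as np


def steering_matrix(n, angles):
    """Half-wavelength ULA steering vectors, one column per arrival angle (radians)."""
    k = np.arange(n)[:, None]
    return np.exp(-1j * np.pi * k * np.sin(np.asarray(angles))[None, :])


def principal_subspace(snapshots, m):
    """Orthonormal basis of the m-dimensional principal subspace of the sample covariance.

    snapshots: complex array of shape (n, window), one column per sensor snapshot.
    """
    cov = snapshots @ snapshots.conj().T / snapshots.shape[1]
    _, vecs = np.linalg.eigh(cov)  # eigenvalues are returned in ascending order
    return vecs[:, -m:]


def projector_error(basis_hat, basis_true):
    """Frobenius norm of the difference between the two orthogonal projectors."""
    p_hat = basis_hat @ basis_hat.conj().T
    p_true = basis_true @ basis_true.conj().T
    return np.linalg.norm(p_hat - p_true)


rng = np.random.default_rng(0)
n, m, window = 8, 2, 50
A = steering_matrix(n, [0.2, -0.5])
true_basis, _ = np.linalg.qr(A)  # orthonormal basis of the signal subspace

signals = rng.standard_normal((m, window)) + 1j * rng.standard_normal((m, window))
noise = 0.01 * (rng.standard_normal((n, window)) + 1j * rng.standard_normal((n, window)))

err_clean = projector_error(principal_subspace(A @ signals, m), true_basis)
err_noisy = projector_error(principal_subspace(A @ signals + noise, m), true_basis)
```

Comparing projectors rather than basis matrices makes the error independent of the particular orthonormal basis chosen for each subspace, which is the natural notion of distance on a Grassmann manifold.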


For details, please refer to the articles [9, 11, 10]. This research is being performed in
collaboration with Prof. Eric Klassen of the Department of Mathematics, FSU.
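
The prediction and update steps of Eqns. (7) and (8) can be illustrated with a bootstrap particle filter on the simplest manifold of interest, the circle SO(2). This is a minimal sketch under stated assumptions and not the filter developed in this research: the dynamics are a known constant drift with Gaussian perturbations, the likelihood is a Gaussian kernel in the wrapped angular difference, resampling is multinomial, and the point estimate is the weighted extrinsic (circular) mean. All names and parameter values are hypothetical.

```python
import numpy as np


def wrap(angle):
    """Map angles to the interval [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi


def simulate_circle_track(steps=100, drift=0.1, q=0.05, r=0.3, seed=1):
    """Simulate a rotating state on the circle and noisy angle observations."""
    rng = np.random.default_rng(seed)
    truth = np.zeros(steps)
    angle = 0.0
    for t in range(steps):
        angle = wrap(angle + drift + q * rng.standard_normal())
        truth[t] = angle
    obs = wrap(truth + r * rng.standard_normal(steps))
    return truth, obs


def particle_filter_circle(obs, n_particles=500, drift=0.1, q=0.05, r=0.3, seed=0):
    """Bootstrap particle filter; returns one angle estimate per observation."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-np.pi, np.pi, n_particles)  # uniform prior on the circle
    estimates = np.zeros(len(obs))
    for t, y in enumerate(obs):
        # Prediction, Eqn. (7): push each sample through the prior dynamics.
        particles = wrap(particles + drift + q * rng.standard_normal(n_particles))
        # Update, Eqn. (8): weight each sample by the likelihood of the observation.
        log_w = -0.5 * (wrap(y - particles) / r) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Point estimate: weighted mean direction of the posterior samples.
        estimates[t] = np.arctan2(np.sum(w * np.sin(particles)),
                                  np.sum(w * np.cos(particles)))
        # Multinomial resampling to obtain equally weighted posterior samples.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return estimates


truth, obs = simulate_circle_track()
estimates = particle_filter_circle(obs)
tracking_error = np.abs(wrap(estimates - truth))
```

The same predict, weight, estimate, and resample loop carries over to other manifolds once the dynamics, the likelihood, and a suitable notion of mean are defined on them.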

2.4  Asymptotic  Bayesian  ATR  Performance  Analysis 

To recognize a target, estimation of the associated target attributes, such as pose, motion, lighting,
and thermal profile, becomes essential. Target recognition is performed through Bayesian hypothesis
testing; for a given observation the likelihood ratios are compared to the ratio of priors and a
hypothesis is selected. In a binary case, for an observed image $I$, the Bayesian hypothesis-testing
problem is: $\frac{p(I|H_1)}{p(I|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{P(H_0)}{P(H_1)} = \nu$.
In the presence of nuisance parameters, such as pose and location, $p(I|H_i)$, $i = 0, 1$, is defined
via the integral $p(I|H_i) = \int_S p(I|s, H_i) p(s|H_i) \gamma(ds)$, where $s$ is
a nuisance parameter. In most practical situations, the integrand is too complicated for the integral
to be computed analytically. To obtain analytical expressions, which are often more attractive,
asymptotic approximations using Laplace's method have been derived.


Lemma 1 For any $\alpha \in A$, a uniform prior (the prior is given by the Haar measure), and an
asymptotic situation ($\sigma \to 0$), the likelihood function $p(I|\alpha)$ is given by


1 

W) 


^E{s,a)}-/{ds) 


(27r)m/2 

~WT 


exp 


,  a 


»i 


(2  ff2)' 


det(£^(s*,<r)) 


(9) 


where $E = -\log p(I|s, H)$, $s^* = \arg\min_{s \in S} E(s, \alpha)$, and $\ddot{E}$ is the Hessian of
the function $E$ with respect to $s$.

This approximation leads to an analytical form for the probability of error in binary target
recognition.
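
The quality of a Laplace approximation in the small-noise regime can be checked numerically on the circle. The sketch below is a generic illustration of Laplace's method in one dimension, $\int e^{-E(s)} ds \approx \sqrt{2\pi}\, e^{-E(s^*)} / \sqrt{\ddot{E}(s^*)}$, compared against trapezoidal integration over SO(2); it is not the specific expression of Lemma 1. The energy function, the value of `theta0`, and the function names are hypothetical.

```python
import numpy as np


def trapezoid_on_circle(func, n_points=4096):
    """Trapezoidal rule for a 2*pi-periodic integrand on [0, 2*pi)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return func(theta).sum() * (2.0 * np.pi / n_points)


def laplace_approximation(energy_min, hessian_min):
    """One-dimensional Laplace approximation of the integral of exp(-E)."""
    return np.sqrt(2.0 * np.pi) * np.exp(-energy_min) / np.sqrt(hessian_min)


def compare(sigma, theta0=1.0):
    """Return (trapezoidal value, Laplace value) for a toy energy on the circle."""
    # Hypothetical energy: E(theta) = (1 - cos(theta - theta0)) / sigma^2, whose
    # minimum is E = 0 at theta0 with second derivative 1 / sigma^2.
    def integrand(theta):
        return np.exp(-(1.0 - np.cos(theta - theta0)) / sigma ** 2)

    numeric = trapezoid_on_circle(integrand)
    approx = laplace_approximation(energy_min=0.0, hessian_min=1.0 / sigma ** 2)
    return numeric, approx


numeric_small, approx_small = compare(sigma=0.1)
numeric_large, approx_large = compare(sigma=0.5)
ratio_small = approx_small / numeric_small
ratio_large = approx_large / numeric_large
```

The ratio approaches one as `sigma` decreases, which is the asymptotic regime in which the analytical expressions are intended to be used.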


Theorem 3 Assuming the VIDEO sensor model with additive Gaussian noise, the probability of
misclassification of the first kind (selecting $H_1$ when $H_0$ is true with parameter $s_0$) is given by
^L-exp(-/c2/2),  where 


*  =  (glogM  -  f  log  +  A  +  M  ~  JW»o) )  ,  <md  P  = 

\  2  det(EQl(5i,0))  ) 


\Ao  +  —  2p 


Shown  in  Figure  11  are  the  plots  for  the  probability  of  identifying  truck  when  the  actual  target 
used  in  generating  I  was  tank,  using  the  VIDEO  sensing  model.  For  comparison,  the  experimental 
approximation  of  this  error  probability  is  plotted  along  with  the  analytical  approximation  of  The¬ 
orem  1.  At  each  noise  level,  the  experimental  probability  is  computed  from  multiple  realizations  of 
additive  noise,  performing  nuisance  integration  on  S'  =  50(2)  using  trapezoidal  method  for  each 
realization,  and  finding  the  relative  frequency  of  incorrect  decisions.  The  solid  line  plots  the  ana¬ 
lytical  expression  and  the  broken  line  plots  the  experimental  approximation.  Shown  in  the  other 
three  panels  are  the  sample  VIDEO  images  of  the  tank  at  the  noise  levels  given  by  /3  =  0.01,0.02 
and  0.1. 


3  List  of  Equipment  Acquired  with  this  Grant 

The  following  items  were  purchased  through  this  DURIP  funding. 

1.  One  SGI  Octane  workstation. 

2.  One  Toshiba  laptop  computer 

3.  One  Canon  Digital  Camera 

4.  Three  desktop  Dell  personal  computers 

5.  One  Gateway  desktop  personal  computer 

Figure 11: Top-left panel: the curves denote the log-probability of recognizing the truck when the
tank is the true target (at a certain pose), comparing the experimental results (broken line) with
analytical estimates (solid line). The other three panels show VIDEO images of the tank at noise
levels given by β = 0.01 (top-right), 0.02 (bottom-left), and 0.1 (bottom-right).


6.  Two  Epson  800  inkjet  printers 

7.  One  HP  b/w  laser  printer 

8. One Panasonic TV/VCR combination.

9. Data cartridges, video tapes, computer cables, hard drives, SGI motherboard repair, and SGI
memory.

10.  UPS  power  backup  for  SGI  and  other  computers. 

11.  Splus  and  other  statistical  software. 

12.  Books  and  software  manuals. 

In addition to this DURIP award, we also received equipment funding from an NSF MRI award and
the FSU Research Foundation. Combining this support, we have developed a state-of-the-art
laboratory for research in image understanding. Named the Laboratory for Computational Vision,
it includes researchers from Statistics and Computer Science. Details about this laboratory
can be obtained by visiting http://lcv.stat.fsu.edu. During this reporting period, our research
has benefited greatly from a number of imaging devices that were acquired for the laboratory.
We have purchased two Minolta vivid700 three-dimensional scanners that can generate polygonal
meshes (discretized surfaces) of the laboratory targets used in ATR experiments. These scanners
have been used in generating CAD models for IR image prediction, and for human face tracking
from video sequences. We have also acquired a Raytheon PalmIR PRO thermal imager that operates
in the 7-14 μm spectral range and generates images of size 320 x 240 in 8-bit BMP format at a
(typical) sensitivity level of 100 mK. The scene temperature range for this camera, relative to
the background, is 500 °C. Additionally, we have purchased two high-performance Olympus digital
cameras and a Sony digital video camera, which are being used in generating textures, natural
images, and video sequences needed in several ongoing projects in the lab.
5.  List of Papers Presented at Meetings

Anuj  Srivastava: 

(a) Geometric Tracking on Quotient Manifolds, Annual meeting of Royal Statistical Society,
Scotland, UK, July 2001.

(b) Analytical Models for Spectral Analysis of Natural Images, Annual meeting of Royal
Statistical Society, Scotland, UK, July 2001.

(c) Nonlinear Filtering for Tracking Principal Subspaces, AFOSR/AFRL workshop on
Nonlinear Filtering, Dayton, February, 2001.

(d) ATR via Pose and Location Estimation, ARO CIS Annual Review Meeting, Baltimore,
March, 1999.

(e) Metrics for Recognizing Ground Targets, AMCOM/ARO workshop on Metrics for ATR,
Huntsville, November, 1998.

(f) ATR Performance Analysis and Sensor Fusion, ONR/GTRI workshop on Target Tracking
and Sensor Fusion, Atlanta, June, 1998.

Jayaram  Sethuraman: 

(a) Further properties of Dirichlet measures, presented at the 1998 Lukacs Symposium
“Statistics for the 21st Century” at Bowling Green University, Bowling Green, April,
1998.

(b) Further properties of Dirichlet measures, presented at the International Conference in
Reliability and Survival Analysis at Northern Illinois University, DeKalb, May, 1998.

(c) Specification of Joint Distributions from Marginal and Conditional Distributions,
presented as an invited paper at the Symposium on Decision Theory at Purdue University,
Lafayette, June, 1998.
(d) Conformation in Metric Pattern Theory, presented at the Army Statistician’s Conference
at New Mexico State University, Las Cruces, October, 1998.

(e) Conformation in Metric Pattern Theory, presented at the meeting of the Florida Chapter
of the American Statistical Association in Gainesville, February, 1999.

(f) Reduction in Predictive Ability Caused by Discretization of the Independent Variable,
presented at the Army Statistics Conference at West Point, October, 1999.

(g) Limit Theorems for Models in Pattern Analysis, invited talk at the International
Conference on Stochastic Processes and their Applications held at Cochin, India, December,
1999.

(h) Properties and Approximations of Dirichlet Processes, presented at the 2000 Annual meeting
of the Canadian Statistical Association in Ottawa, ONT, June, 2000.

(i)  Modeling  Transmission  Loss  in  a  Large  Network  -  presented  at  the  Army  Conference  on 
Applied  Statistics  held  at  Rice  University,  Houston,  October,  2000. 

5  Scientific  Personnel  Supported 

This DURIP funding was for the acquisition of equipment, and no personnel were supported by this
funding. Anuj Srivastava, Co-PI, and a graduate student, Brian Thomasson, are supported by a
separate ARO grant, DAAD19-99-1-0267.

6  Reports  of  Inventions 

None. 